usmap Package
I wanted to map data from the “Incarceration Trends” #tidytuesday project on a geographic map after being inspired by a more traditional graphical representation from lizwillow on GitHub. Beyond the clarity of that data visualization, I also like to explore national and state-level data by comparing national metrics to those of the states I have lived in.
I ran into issues with the map packages I had used previously, so I decided to start with a new-to-me package. I hoped that would help my brain detach from whatever kept me from getting what I wanted out of old code I had written for a few specific purposes. I thought I could pay better attention to detail if I knew I was learning a whole new package.
I chose the usmap package because it can map Alaska and Hawaii, unlike some other map packages I have used before. Even though much of the mapping I do these days is centered on the state of Oregon, I think it’s good to practice with a map package that covers more than the contiguous 48. The usmap documentation is at https://usmap.dev/.
I start by practicing a blank / boundary-only map of the fifty (50) states of the U.S.A. The code for this is from the usmap documentation, but it’s good to start simple.
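Here is a minimal sketch of that blank map, adapted from the usmap documentation as I understand it. The title and subtitle text are my own placeholders, not the documentation's wording.

```r
library(usmap)
library(ggplot2)

# Boundary-only map of the states: no data is supplied, so nothing is filled.
blank_map <- plot_usmap(regions = "states") +
  labs(title = "U.S. states",            # placeholder title
       subtitle = "Boundary-only map")   # placeholder subtitle

blank_map
```

Because `plot_usmap()` returns a ggplot object, the usual `labs()` and `theme()` layers can be added with `+`.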
Using area data contained in the usmap package:
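A sketch of what this step can look like, assuming the data in question is the `countypov` dataset bundled with usmap and that the area of interest is Oregon (`include = "OR"` is my assumption). The name of the poverty column depends on the installed package version, so the sketch looks it up instead of hard-coding it.

```r
library(usmap)
library(ggplot2)

# countypov ships with usmap. Its poverty column is named after the data year
# (for example pct_pov_2014 in older releases), so find it by pattern.
pov_col <- grep("^pct_pov", names(usmap::countypov), value = TRUE)[1]

# County map limited to one state (assumed here to be Oregon), filled by poverty rate.
or_pov <- plot_usmap(regions = "counties", include = "OR",
                     data = usmap::countypov, values = pov_col) +
  scale_fill_continuous(low = "white", high = "darkblue",
                        name = "Poverty rate (%)") +
  theme(legend.position = "right")

or_pov
```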
I also wanted to map newer poverty rate data onto the county map using usmap package.
Documentation for 2019 poverty data from Small Area Income and Poverty Estimates (SAIPE) Program available here.
Let’s see if it’s that easy when I scale it up to map county rates of poverty onto the entire U.S. map.
In the first attempt, I removed the labels argument, so it defaults to labels = FALSE. I removed the include = argument, so all fifty (50) states show up on the map, and I adjusted the plot labs, so I wouldn’t get confused. The result? I got a map that only has state outlines. Worse, it had all states the same saturation of grey fill.
In the second attempt, I filtered out fips = 0, which is the row for the entire U.S., and filtered out every fips containing “000” in an attempt to remove all the state totals. That did a pretty good job, but I noticed that several counties got swept up in the removed data, which left them with grey fill rather than a data-scaled fill.
I got it in the third attempt, where I only removed the row where fips == 0.
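The toy example below illustrates why a "contains 000" filter catches real counties. The rows are hypothetical stand-ins for the poverty file, not the actual data: a county code such as 10001 (Kent County, Delaware) contains "000" even though it is not a state total. My guess is that the third attempt works because state-total codes like 41000 do not correspond to any county polygon, so they simply go unmatched.

```r
# Hypothetical rows mimicking the FIPS layout: the national row (0),
# state totals (county part 000), and ordinary counties.
pov <- data.frame(
  fips = c(0L, 1000L, 1001L, 10000L, 10001L, 41000L, 41051L),
  name = c("United States", "Alabama", "Autauga County, AL", "Delaware",
           "Kent County, DE", "Oregon", "Multnomah County, OR"),
  stringsAsFactors = FALSE
)
pov$fips_chr <- sprintf("%05d", pov$fips)  # zero-padded, five-character codes

# Second attempt: drop every code containing "000" anywhere.
# This also drops 10001, which is a real county.
attempt2 <- pov[!grepl("000", pov$fips_chr, fixed = TRUE), ]

# Third attempt: drop only the national row.
attempt3 <- pov[pov$fips != 0L, ]

# A stricter alternative: state and national totals are exact multiples of 1000.
counties_only <- pov[pov$fips %% 1000L != 0L, ]

attempt2$name
counties_only$name
```

The modulo test keeps Kent County while still removing the national and state rows, which is an option if the unmatched state totals ever cause trouble.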
Success! Well, success enough to move on to my originally intended data.
Original Endnote from 06/12/2021:
But I’m tired, so I’ll get to that next week. I would like to get to it tomorrow because I’m excited about it, but I am in the midst of a move, so I expect I will be pretty well occupied.
I bring in the data from the Tidy Tuesday repo,
and calculate and plot the national and by-state averages.
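A small sketch of the by-state and national calculation on made-up numbers. The column names (`year`, `state`, `population`, `pretrial_population`) are assumptions about the file layout, and the approach shown pools counts before computing a rate per 100,000, which may differ from a plain average of county rates.

```r
# Hypothetical county-level rows; the values are invented for illustration.
pretrial <- data.frame(
  year  = c(2006, 2006, 2006, 2015, 2015, 2015),
  state = c("OR", "OR", "WA", "OR", "OR", "WA"),
  population          = c(100000, 50000, 200000, 110000, 55000, 210000),
  pretrial_population = c(150, 50, 400, 220, 110, 420)
)

# Sum counts within each year and state, then convert to a rate per 100,000.
by_state <- aggregate(cbind(population, pretrial_population) ~ year + state,
                      data = pretrial, FUN = sum)
by_state$rate_per_100k <- by_state$pretrial_population / by_state$population * 1e5

# Same idea for the national figure: pool everything within each year.
national <- aggregate(cbind(population, pretrial_population) ~ year,
                      data = pretrial, FUN = sum)
national$rate_per_100k <- national$pretrial_population / national$population * 1e5

by_state
national
```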
The documentation for the usmap package describes using fips() to return a complete listing of state and county FIPS codes. That did not work for me: it returned a vector of fifty-one (51) two-digit codes, one for each of the 50 states plus Washington, D.C. Maybe it’s an issue with the R 4.0.0 version of the package. Whatever it is, on to the next option, the maps package.
OK, my method may be a bit inelegant, but now I have the county fips merged to my Oregon data on pretrial populations. I’ll start with a map of the pretrial population rates by county in 2006, the year I first moved to Oregon.
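For illustration, here is one way such a merge can be done. The lookup table below is a hand-built stand-in with the same shape as `county.fips` from the maps package (a `fips` column and a `polyname` column like "oregon,baker"). The Oregon rows, the `county_name` column, and the rate values are hypothetical.

```r
# Stand-in for maps::county.fips, trimmed to three Oregon counties.
county_fips <- data.frame(
  fips     = c(41001L, 41003L, 41051L),
  polyname = c("oregon,baker", "oregon,benton", "oregon,multnomah"),
  stringsAsFactors = FALSE
)

# Hypothetical Oregon pretrial rates keyed by county name.
oregon <- data.frame(
  county_name = c("Baker County", "Benton County", "Multnomah County"),
  rate        = c(120, 95, 310),
  stringsAsFactors = FALSE
)

# Build a key in the same "state,county" form as the lookup table, then merge.
oregon$polyname <- paste0("oregon,",
                          tolower(sub(" County$", "", oregon$county_name)))
merged <- merge(oregon, county_fips, by = "polyname")

merged[, c("county_name", "fips", "rate")]
```

With a `fips` column attached, the data frame is in the shape `plot_usmap()` expects for its `data` argument.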
Let’s compare that to the most recent year in the Incarceration Trends data, 2015.
Wow. Not only did the distribution of pretrial incarceration shift over the decade, but the rates themselves increased substantially. The legend for 2006 data indicates a top rate near 300, but the top end of the 2015 data is a rate of just over 450 people (per 100,000 in the county) incarcerated awaiting trial.
Let’s look at the change over time for the state overall with a simple line graph.
Hmmm. The statewide pretrial detention rates appear pretty close to each other when we look at 2006 and 2015 on this graph.
Let’s look at by-county rates of pretrial incarceration in a line graph.
This is a pretty busy graph, and the data all maps, but a few areas have me skeptical of the completeness and accuracy of this data. Wallowa County has zero (0) people listed in pretrial detention for the entire time frame. Hood River County shows a distinct gap in the 1994 - 1999 range, where the county suddenly records zero (0) people held pretrial. I would bet that these counties simply did not report to the database used for this Incarceration Trends data set.
Linn County shows a significant drop-off in 2015, but it is not zero (0), and that county’s rates had been decreasing over the previous couple of years. This one may be accurate.
I am looking for a way to modify the in-map labels. I see in the usmap documentation that I can easily alter the text color of the labels (on states or on counties). I want to change the label font size and the label positioning, so the county labels no longer overlap severely enough to be unclear for anyone unfamiliar with Oregon counties. I also want a way to include/exclude subsets of labels. When I’m comparing a group of counties or states to the rest of the state or nation and am using a scale_fill option, I want to indicate the polygons of interest by labeling them and not the others.
I also want to highlight a subset of the county boundaries. This is for the same reason I want to include/exclude subsets of labels as mentioned above.
The associated .Rmd file is available at https://github.com/RAAmodeo/r_examples/tree/master/usmap.